[SPARK-7586][ML][doc] Add docs of Word2Vec in ml package #6181

Closed
wants to merge 6 commits into from

yinxusen
Contributor

CC @jkbradley.

JIRA issue.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented May 15, 2015

Test build #32804 has started for PR 6181 at commit 1c3f389.

@SparkQA

SparkQA commented May 15, 2015

Test build #32804 has finished for PR 6181 at commit 1c3f389.

  • This patch fails RAT tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • public class JavaWord2VecSuite

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32804/
Test FAILed.

@SparkQA

SparkQA commented May 16, 2015

Test build #812 has started for PR 6181 at commit 1c3f389.

@SparkQA

SparkQA commented May 16, 2015

Test build #812 has finished for PR 6181 at commit 1c3f389.

  • This patch fails RAT tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

{% highlight scala %}
import org.apache.spark.ml.feature.Word2Vec

val documentDF = sqlContext.createDataFrame(Seq(
Member

Add comment in line above:

Input data: Each row is a bag of words from a sentence or document.

(Please add to other examples too.)

@jkbradley
Member

@yinxusen I tried the Scala example but got this error:

scala> val documentDF = sqlContext.createDataFrame(Seq(
     |   "Hi I heard about Spark".split(" "),
     |   "I wish Java could use case classes".split(" "),
     |   "Logistic regression models are neat".split(" ")
     | )).map(Tuple1.apply).toDF("text")
<console>:20: error: overloaded method value createDataFrame with alternatives:
  [A <: Product](data: Seq[A])(implicit evidence$4: reflect.runtime.universe.TypeTag[A])org.apache.spark.sql.DataFrame <and>
  [A <: Product](rdd: org.apache.spark.rdd.RDD[A])(implicit evidence$3: reflect.runtime.universe.TypeTag[A])org.apache.spark.sql.DataFrame
 cannot be applied to (Seq[Array[String]])
       val documentDF = sqlContext.createDataFrame(Seq(

Did these all run for you?

The Java file needs the Apache license at the top.
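
The compiler error above follows from the type bound on `createDataFrame`: both listed overloads require `A <: Product`, and `Array[String]` is not a `Product`. In the pasted snippet the `.map(Tuple1.apply)` call sits outside the `createDataFrame(...)` parentheses, so the raw arrays reach `createDataFrame` unwrapped. A minimal sketch of the issue without Spark, where `ProductBoundSketch` and `acceptsProducts` are hypothetical stand-ins (not Spark APIs) that only mimic the bound:

```scala
object ProductBoundSketch {
  // Hypothetical stand-in that copies only the type bound of
  // createDataFrame[A <: Product](data: Seq[A]); it is not a Spark API.
  def acceptsProducts[A <: Product](data: Seq[A]): Int = data.length

  // Each element is a bag of words, typed as Array[String].
  val sentences: Seq[Array[String]] = Seq(
    "Hi I heard about Spark".split(" "),
    "I wish Java could use case classes".split(" "),
    "Logistic regression models are neat".split(" ")
  )

  // acceptsProducts(sentences)
  // The line above would not compile: Array[String] is not a Product.

  // Wrapping each array in Tuple1 yields a Product, which satisfies the bound.
  val rows: Seq[Tuple1[Array[String]]] = sentences.map(Tuple1.apply)
  val rowCount: Int = acceptsProducts(rows)

  def main(args: Array[String]): Unit =
    println(s"accepted $rowCount rows")
}
```

Under that reading, moving the closing parenthesis so that the wrapped sequence is the argument, that is `sqlContext.createDataFrame(Seq(...).map(Tuple1.apply)).toDF("text")`, would likely satisfy the bound. This is an assumption about the intended example and is not confirmed in this thread.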

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented May 17, 2015

Test build #32935 has started for PR 6181 at commit 57a4c07.

@SparkQA

SparkQA commented May 17, 2015

Test build #32935 has finished for PR 6181 at commit 57a4c07.

  • This patch fails RAT tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • public class JavaWord2VecSuite

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32935/
Test FAILed.

@jkbradley
Member

@yinxusen This needs the Apache license at the top of all files. (That is causing the RAT test failure.)

@jkbradley
Member

@yinxusen Except for the RAT test issue, this LGTM. You may encounter merge issues from other updates to ml-features.md, but they should be easy to fix.

@yinxusen
Contributor Author

@jkbradley Sorry for forgetting the Apache license again. It is fixed now.

@yinxusen
Contributor Author

@jkbradley It seems there are no merge conflicts for now.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented May 19, 2015

Test build #33050 has started for PR 6181 at commit 77014c5.

@SparkQA

SparkQA commented May 19, 2015

Test build #33050 has finished for PR 6181 at commit 77014c5.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • public class JavaWord2VecSuite

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33050/
Test FAILed.

@SparkQA

SparkQA commented May 19, 2015

Test build #830 has started for PR 6181 at commit 77014c5.

@yinxusen
Contributor Author

@jkbradley Is there more concrete error information for the MiMa error? I searched the console output of the test but could not locate the error.

@SparkQA

SparkQA commented May 19, 2015

Test build #830 timed out for PR 6181 at commit 77014c5 after a configured wait of 150m.

@jkbradley
Member

@yinxusen It's not actually a MiMa error; that's a bug in the script that reports on the tests. It looks to me like Jenkins is being flaky.

@SparkQA

SparkQA commented May 19, 2015

Test build #831 has started for PR 6181 at commit 77014c5.

@SparkQA

SparkQA commented May 19, 2015

Test build #831 has finished for PR 6181 at commit 77014c5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • public class JavaWord2VecSuite

@jkbradley
Member

LGTM. Merging into master and branch-1.4.
@yinxusen Thank you!

asfgit pushed a commit that referenced this pull request May 19, 2015
CC jkbradley.

JIRA [issue](https://issues.apache.org/jira/browse/SPARK-7586).

Author: Xusen Yin <[email protected]>

Closes #6181 from yinxusen/SPARK-7586 and squashes the following commits:

77014c5 [Xusen Yin] comment fix
57a4c07 [Xusen Yin] small fix for docs
1178c8f [Xusen Yin] remove the correctness check in java suite
1c3f389 [Xusen Yin] delete sbt commit
1af152b [Xusen Yin] check python example code
1b5369e [Xusen Yin] add docs of word2vec

(cherry picked from commit 68fb2a4)
Signed-off-by: Joseph K. Bradley <[email protected]>
@asfgit asfgit closed this in 68fb2a4 May 19, 2015